QVAC-18612 infra: gate every secret-bearing workflow with label-gate #1997
Throwaway helper that gates every secret-bearing job in a workflow on
the local `label-gate` composite action. Used to generate the diff in
the next commit; committed for reviewer reproducibility (apply HEAD~1,
run the script on the same target list, diff against HEAD).
Detection heuristic ("secret-bearing job"):
- explicit `environment:`, OR
- any `${{ secrets.<X> }}` other than `GITHUB_TOKEN`, OR
- `secrets: inherit` on a `workflow_call`, OR
- a non-empty per-call `secrets:` mapping.
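The four-way heuristic above can be sketched in a few lines of Python. This is an illustration only, not the code in `scripts/migrate_label_gate.py`: the helper names `is_secret_bearing` and `_strings` are hypothetical, and the job is assumed to be a plain dict as a YAML loader would produce it.

```python
import re

# Captures NAME from `secrets.NAME`; the lookbehind avoids matching
# unrelated contexts such as `steps.secrets.outputs.x`.
SECRET_REF = re.compile(r"(?<![\w.])secrets\.([A-Za-z_][A-Za-z0-9_]*)")


def _strings(node):
    """Yield every string value nested anywhere inside a parsed YAML node."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from _strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from _strings(item)


def is_secret_bearing(job: dict) -> bool:
    """Apply the four-way heuristic to one job mapping (hypothetical helper)."""
    # 1. Explicit `environment:` (string or mapping form).
    if "environment" in job:
        return True
    secrets = job.get("secrets")
    # 2. `secrets: inherit` on a reusable-workflow call.
    if "uses" in job and secrets == "inherit":
        return True
    # 3. Non-empty per-call `secrets:` mapping.
    if isinstance(secrets, dict) and secrets:
        return True
    # 4. Any `secrets.<X>` reference other than GITHUB_TOKEN.
    return any(
        name != "GITHUB_TOKEN"
        for text in _strings(job)
        for name in SECRET_REF.findall(text)
    )
```

The reference scan deliberately over-approximates: it looks at every string in the job, so a `secrets.<X>` mention inside a shell comment would also count.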
Implementation notes:
- ruamel.yaml is used only to identify line ranges and YAML semantics;
all edits are line-based to preserve comments, quoting, indentation,
and ordering exactly.
- Idempotent: re-runs detect the existing `label-gate:` job and skip.
- Folded `if: >-` scalars are collapsed to a single line; original
`${{ ... }}`-wrapped expressions are unwrapped before composing
(GHA evaluates `if: <bare> && ${{ <expr> }}` as an always-true
string literal).
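As a rough sketch of the `if:` handling described in the last note, the function below collapses a folded scalar to one line, unwraps a single outer `${{ ... }}`, and composes a fully bare expression. The name `compose_if`, the added parentheses around the original expression, and the decision to reject mixed expressions are assumptions of this sketch, not confirmed behaviour of the migration script.

```python
import re

GATE = "needs.label-gate.outputs.authorised == 'true'"

# An expression wrapped entirely in `${{ ... }}`.
_WRAPPED = re.compile(r"^\$\{\{\s*(.*?)\s*\}\}$", re.DOTALL)


def compose_if(existing):
    """Return a bare `if:` expression that ANDs the gate with `existing`."""
    if existing is None:
        return GATE
    if isinstance(existing, bool):
        # YAML loaders parse `if: true` as a Python bool.
        expr = "true" if existing else "false"
    else:
        # Collapse folded or multi-line scalars to a single line.
        expr = " ".join(str(existing).split())
    match = _WRAPPED.match(expr)
    # Unwrap only when the whole expression is one `${{ }}` pair.
    if match and "${{" not in match.group(1):
        expr = match.group(1)
    if "${{" in expr:
        # Mixing bare text with `${{ }}` would yield the always-true
        # string literal, so this sketch refuses to compose it.
        raise ValueError(f"cannot compose mixed expression: {expr!r}")
    # Parentheses keep an existing `a || b` from escaping the gate.
    return f"{GATE} && ({expr})"
```

For example, `compose_if("${{ github.event_name == 'push' }}")` yields `needs.label-gate.outputs.authorised == 'true' && (github.event_name == 'push')`.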
Co-authored-by: Cursor <cursoragent@cursor.com>
Inserts a `label-gate` job at the top of `jobs:` in every secret-bearing
workflow and updates each downstream secret-bearing job to require
`needs: [..., label-gate]` and `if: needs.label-gate.outputs.authorised
== 'true' && <existing>`.
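The `needs:` update has to cope with the three shapes GitHub Actions accepts: absent, a single string, or a list. A minimal sketch follows, assuming the value has already been parsed into Python; `add_gate_need` is a hypothetical name, and the real script edits lines rather than parsed values.

```python
def add_gate_need(needs, gate="label-gate"):
    """Return `needs` as a list that includes the gate job exactly once."""
    if needs is None:
        current = []            # job had no `needs:` key
    elif isinstance(needs, str):
        current = [needs]       # scalar form: `needs: build`
    else:
        current = list(needs)   # list form: `needs: [a, b]`
    if gate not in current:
        current.append(gate)    # appending last mirrors `[..., label-gate]`
    return current
```

Because the gate is only appended when missing, applying the function twice gives the same result as applying it once.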
110 workflow files migrated via the throwaway scripts/migrate_label_gate.py
introduced in the previous commit. Net: 2,809 insertions, 695 deletions.
Coverage: every job in this repo that sets `environment:`, references
`${{ secrets.<X> }}` (other than `GITHUB_TOKEN`), uses `secrets: inherit`
on a `workflow_call`, or maps secrets explicitly into a reusable
workflow now passes through `label-gate` before any secret-touching step
runs.
Pre-existing `authorize-pr` peer jobs (16 workflows) are preserved
alongside the new gate; both must authorise for downstream jobs to run
(belt-and-suspenders during the staged migration). Removal of the
authorize-pr layer lands in a follow-up.
actionlint clean post-migration; zero new warnings introduced.
Idempotent: running the migration script again is a no-op.
Co-authored-by: Cursor <cursoragent@cursor.com>
❌ E2E Mobile Test Results - iOS
Overall Status: FAILED
Automated E2E mobile testing powered by AWS Device Farm
❌ E2E Mobile Test Results - Android
Overall Status: FAILED
Automated E2E mobile testing powered by AWS Device Farm
What problem does this PR solve?
- The `label-gate` composite action (added in #1968, "QVAC-18608 infra: add .github/actions/label-gate (Node 20)", and hardened in #1973, "QVAC-18608 fix(label-gate): preserve hyphens in input env-var names"; #1978, "QVAC-18608 fix(label-gate): strip label when non-trusted user applies it"; and #1995, "QVAC-18608 fix: trust repository_dispatch events in label-gate") only protects the workflows that call it. Until now, only the `vulkaninfo.yml` canary did. Every other secret-bearing workflow in this repo still relies solely on the older `authorize-pr` team check, or on nothing at all when triggered from `pull_request` / `pull_request_target`.
- This PR makes the `verified` label the single PR-side gate for every secret-bearing workflow, so a stranger's PR cannot reach `secrets.PAT_TOKEN`, NPM publish credentials, AWS Device Farm, etc., until a member of `qvac-internal-release` (or one of the other authorised teams) explicitly applies the label.
How does it solve it?
- Insert a `label-gate` job at the top of every secret-bearing workflow. The job runs the local `./.github/actions/label-gate` composite, which fails closed on any unrecognised event and on any PR-context event whose `verified` label was not applied by a trusted actor (and strips the label on misuse, per #1978). The action itself defaults to the three trusted teams (`qvac-internal-dev`, `qvac-internal-merge`, `qvac-internal-release`); the per-workflow gate jobs do not override `teams:` / `users:`, so the trust policy lives in exactly one place (`.github/actions/label-gate/action.yml`).
- Make every secret-bearing job depend on `label-gate` and AND its `if:` with `needs.label-gate.outputs.authorised == 'true'`. Existing `authorize-pr` checks are preserved alongside the new gate (belt-and-suspenders during the staged migration the task spec calls out; `authorize-pr` removal lands in a follow-up).
- The migration script uses `ruamel.yaml` purely to identify line ranges and surrounding YAML semantics; all edits are line-based to preserve comments, ordering, quoting, and indentation exactly. Detection heuristic for a "secret-bearing" job: explicit `environment:`, OR any `${{ secrets.<X> }}` other than `GITHUB_TOKEN`, OR `secrets: inherit` on a `workflow_call`, OR a non-empty per-call `secrets:` mapping. Jobs hard-disabled with `if: false` are skipped (gating is a no-op). The two approval-machinery workflows (see below) are skipped via an explicit exemption.
- Every secret-bearing job covered by the migration now passes through `label-gate` before any secret-touching step runs.
Approval-machinery exemption
Two workflows are intentionally NOT gated and remain byte-identical to `main`:
- `.github/workflows/approval-worker.yml`
- `.github/workflows/approval-check-worker.yml`
These are the tier-based approval bot itself, triggered by `issue_comment` / `pull_request_review` to compute the `Tier-based Approval Check` status check. Gating them with `label-gate` would produce a deadlock:
- A PR would get no `Tier-based Approval Check` status until it has the `verified` label.
- `verified` is applied after human review.
- So `verified` would never be requested (no signal to reviewers that the PR is ready), and even if it were applied, the approval bot would only re-evaluate on the next review/comment event.
These workflows are part of the gate machinery itself and must run ungated, the same way `authorize-pr` runs ungated.
`if: false` jobs
Jobs with a hard-disabled `if: false` (e.g. the `notify` job in `docs-deploy-notify.yml`) are deliberately not rewritten: the gate guard would have clobbered the inline explanation comment for zero behavioural change. If a workflow's only secret-bearing job is hard-disabled, the file is left untouched (no orphan `label-gate` job inserted).
Net workflow scope
- [...] `main` (deadlock prevention).
- [...] `DOCS_SYNC_PAT` / `AI_AUGMENT_API_KEY`).
- [...] `if: false`).
yamlfmt cleanup (commit `cf1311f1`, narrowed by `4a8700ff` and `15949925`)
The label-gate fan-out touches `.github/workflows/*onnx*.yml`, which matches `on-pr-onnx.yml`'s `paths:` filter. That workflow runs `yamlfmt` v0.17.0 against the entire repo at the default `workdir`, so any pre-existing yamlfmt drift in the workflows would block this PR from merging. The cleanup was therefore bundled in for the workflows directory only.
- `cf1311f1`: `yamlfmt` v0.17.0 with `-formatter retain_line_breaks_single=true` across `.github/workflows/` and `.github/actions/`.
- `4a8700ff`: reverts the composite-action portion of `cf1311f1`. The `retain_line_breaks_single=true` formatter inserts `#magic___^_^___line` markers throughout collapsed folded scalars and folds multi-line `description:` blocks into long single lines, which is review noise unrelated to the gate change.
- `15949925`: reverts a 4-file `packages/transcription-whispercpp/` sweep that had been included for the same yamlfmt-CI reason. Touching package paths triggers expensive build pipelines wholly unrelated to this PR.
Composite-action and packages drift remains pre-existing on `main` and will be addressed in a dedicated yamlfmt cleanup PR. Trade-off: the `on-pr-onnx` yamlfmt sub-check may flag the unchanged drift in those areas. That is identical to pre-existing CI noise on `main`, not a regression introduced by this PR.
Net diff scope: 123 files, all under `.github/workflows/`. Zero composite actions, zero packages, zero scripts.
How was it tested?
- `actionlint` clean against the full `.github/workflows/` tree post-migration. Remaining warnings (shellcheck reported, `runner-label`, composite-action `type` key, workflow-call signature mismatches) all exist on `origin/main`, verified by diffing the warning sets. Zero new lint warnings introduced by the migration or the yamlfmt cleanup.
- Idempotent: running the migration script again is a no-op (no `MIGRATED` lines).
- [...] `trigger-docs-translation-nmtcpp.yml`), Pattern B with peer `authorize-pr` (`publish-sdk.yml`, `on-pr-bci-whispercpp.yml`), reusable `workflow_call` with `secrets: inherit` (`test-android-sdk.yml`, `cpp-lint.yaml`), folded `if: >-` blocks (`trigger-reusable-lib-cli.yml`, `benchmark-ocr-onnx.yml`), `if: always()` cleanup paths (`test-android-sdk.yml`'s `cleanup-device-farm`), and the inline-step `authorize-pr` outlier (`on-pr-test-sdk.yml`: the script gates the downstream `run-tests` workflow_call correctly without touching the inline step).
- `if: ${{ ... }}`-wrapping bug caught and fixed during validation. GHA evaluates `if: <bare> && ${{ <expr> }}` as a string literal (always true). The script now unwraps the outer `${{ }}` from the original expression before composing, so the resulting `if:` is fully bare.
- The canary (`vulkaninfo.yml`) currently authorises and skips correctly under the rebased label-gate behaviour. Full live validation against THIS PR will use the 5-step matrix (no-label deny / label authorise / push-while-labeled / unlabel-then-push deny / full actionlint).
Action pinning
- `actions/checkout`: each new `label-gate` job pins to `de0fac2e4500dabe0009e67214ff5f5447ce83dd # 6.0.2`, the same SHA every other workflow in this repo already uses. No version drift.
Permissions changes
- Added: a per-job permissions grant on the `label-gate` job in each of the 108 gated workflow files.
- [...] `permissions:` blocks are unchanged).
- `contents: read` for the sparse-checkout of `.github/actions/label-gate`; `pull-requests: write` because the gate strips the `verified` label when a non-trusted actor applies it (per #1978). Workflows that already declare a top-level `permissions:` block keep that block untouched; the per-job grant on `label-gate` does not propagate to other jobs.
Known pre-existing CI noise (not introduced by this PR)
- `Tier-based Approval Check` failing: the bot reports `❌ Tier 1 requirements not met. Need: 1 Team Member (0/1) + 1 TL/Management (0/1)`. This is the expected state for a PR with zero approvals; it will go green once reviewers approve.
- CodeQL alerts on `.github/actions/run-lint-and-unit-tests/action.yaml` (`actions/untrusted-checkout/critical`, ×4) and `.github/workflows/publish-sdk.yml` (`actions/artifact-poisoning/critical`, ×2). All 6 are pre-existing open alerts on `main` (created 2026-04-20 and 2026-05-01), verified via `gh api repos/tetherto/qvac/code-scanning/alerts`. CodeQL re-fires them on this PR because the surrounding lines moved (yamlfmt + label-gate insertion shifted line numbers); the underlying alerts will resolve as code-scanning duplicates against `main` once a reviewer attests. This PR introduces zero new CodeQL findings.